"the arithmetic average of $n$ R.V.s. from a random sample of size $n$
$$ \bar{x} = \dfrac{1}{n} \sum_{i=1}^{n} x_i $$
or
"the arithmetic average of the realisation of $n$ R.V.s".
$$ \bar{X} = \dfrac{1}{n} \sum_{i=1}^{n} X_i $$
where $\bar{X}$ is an R.V. with an expectation and variance.
The sample mean is an unbiased estimator of the population mean:
$$ E(\bar{X}_n) = \mu $$
For a population of $N$ units, the population mean and variance are
$$ \mu = \dfrac{1}{N} \sum_{i=1}^{N} x_i $$
$$ \sigma^2 = \frac{\displaystyle\sum_{i=1}^{N}(x_i - \mu)^2} {N} $$
while the unbiased sample variance divides by $n-1$:
$$ s^2 = \frac{\displaystyle\sum_{i=1}^{n}(x_i - \bar{x})^2} {n-1} $$
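A minimal numerical sketch of the distinction (assuming Python with NumPy, which these notes don't otherwise use; the normal draw is purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=1_000)  # realisations x_1, ..., x_n

print(x.mean())            # sample mean
print(np.var(x, ddof=0))   # divides by n: the population formula applied to the data
print(np.var(x, ddof=1))   # divides by n-1: the unbiased sample variance
```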
$f_X(x)$ is the PDF of the R.V. $X$.
$F_X(x)$ is the CDF of the R.V. $X$.
$$ f_X(x) = \frac{d}{dx}F_X(x) $$
$$ E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx $$
$$ E(aX + b) = aE(X) + b $$
$$ \textrm{Var}(aX + b) = a^2\textrm{Var}(X) $$
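A quick simulation check of both linear-transformation rules, under the same NumPy assumption (the exponential draw is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.exponential(scale=2.0, size=100_000)
a, b = 3.0, -1.0
Y = a * X + b

print(Y.mean(), a * X.mean() + b)   # E(aX+b) = aE(X) + b
print(Y.var(), a**2 * X.var())      # Var(aX+b) = a^2 Var(X)
```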
$$ E(Y|X) = \int y f_{Y|X}(y|x)\,dy $$
where $E(Y|X)$ is a random variable.
$$ E(E(Y|X)) = E(Y) $$
$$ \textrm{Var}(E(Y|X)) + E(\textrm{Var}(Y|X)) = \textrm{Var}(Y) $$
$$ \textrm{Var}(X) = E(X^2) - [E(X)]^{2} $$
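A sketch of the tower property, the law of total variance, and the variance identity, using a hypothetical two-stage model $X \sim \textrm{Poisson}(4)$, $Y|X \sim N(X, 1)$ (so $E(Y|X) = X$ and $\textrm{Var}(Y|X) = 1$):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.poisson(lam=4.0, size=200_000)
Y = rng.normal(loc=X, scale=1.0)   # Y | X ~ N(X, 1)

print(Y.mean(), X.mean())                     # E(E(Y|X)) = E(Y)
print(X.var() + 1.0, Y.var())                 # Var(E(Y|X)) + E(Var(Y|X)) = Var(Y)
print((Y**2).mean() - Y.mean()**2, Y.var())   # E(Y^2) - [E(Y)]^2 = Var(Y)
```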
"one trial, two possible outcomes with probability of success, $p$."
$$ E(X) = p $$
$$ \textrm{Var}(X) = pq $$
where $q = 1 - p$.
"$n$ trials with $k$ successes given a probability of success, $p$."
For large $n$ and small $p$ (with $\lambda = np$ held fixed), the binomial distribution approximates the Poisson.
$$ {n \choose k} = \frac{n!}{k!(n-k)!} $$
$$ E(X) = np $$
$$ \textrm{Var}(X) = np(1-p) $$
Think: the number of combinations $\times$ the probability of a single combination occurring.
$$ P(X = k) = {n \choose k}p^{k}(1-p)^{n-k} $$
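A hedged sketch of that pmf built directly from the combination count, checked against simulated draws (parameter values are made up):

```python
import math
import numpy as np

def binom_pmf(k, n, p):
    # number of combinations x probability of a single combination occurring
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
rng = np.random.default_rng(3)
draws = rng.binomial(n, p, size=100_000)

print(binom_pmf(3, n, p), (draws == 3).mean())   # pmf vs empirical frequency
print(draws.mean(), n * p)                       # E(X) = np
print(draws.var(), n * p * (1 - p))              # Var(X) = np(1-p)
```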
$$ P(X = k) = \dfrac{{K \choose k}{N-K \choose n-k}}{{N \choose n}} $$
where $N$ is the population size, $K$ the number of successes in the population, $n$ the number of draws, and $k$ the number of observed successes.
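The same pmf from first principles, checked against NumPy's hypergeometric sampler (illustrative parameter values):

```python
import math
import numpy as np

def hypergeom_pmf(k, N, K, n):
    # k successes in n draws, without replacement, from N units of which K are successes
    return math.comb(K, k) * math.comb(N - K, n - k) / math.comb(N, n)

N, K, n = 50, 10, 12
rng = np.random.default_rng(4)
draws = rng.hypergeometric(ngood=K, nbad=N - K, nsample=n, size=200_000)
print(hypergeom_pmf(3, N, K, n), (draws == 3).mean())
```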
"number of trials, $n$, until success given a probability, $p$".
$$ E(X) = 1/p $$
$$ \textrm{Var}(X) = \dfrac{(1-p)}{p^{2}} $$
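Checking both moments by simulation (NumPy's geometric sampler counts trials up to and including the first success, matching the definition above):

```python
import numpy as np

p = 0.25
rng = np.random.default_rng(5)
trials = rng.geometric(p, size=200_000)   # trials until the first success

print(trials.mean(), 1 / p)               # E(X) = 1/p
print(trials.var(), (1 - p) / p**2)       # Var(X) = (1-p)/p^2
```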
$N_{t}$ is an integer-valued R.V. counting the number of events that occur in a window of length $t$.
$\lambda = \gamma t$ where $\gamma$ is the propensity to arrive per unit of time, and $t$ is a number in units of time.
$$ E(N_{t}) = \lambda $$
$$ \textrm{Var}(N_{t}) = \lambda $$
$$ P(N_{t} = k) = \dfrac{\lambda^{k}e^{-\lambda}}{k!} $$
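A sketch of the pmf from first principles, also checking the binomial approximation noted earlier, with $\lambda = np$ held fixed:

```python
import math

def poisson_pmf(k, lam):
    return lam**k * math.exp(-lam) / math.factorial(k)

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

lam, n = 3.0, 10_000
p = lam / n                      # large n, small p
for k in range(5):
    print(k, poisson_pmf(k, lam), binom_pmf(k, n, p))   # the two agree closely
```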
Waiting time between two events in a Poisson process:
$$ f_X(x) = \lambda e^{-\lambda x} $$
for $x>0$.
The distribution is memoryless, $P(X \geq s+t \mid X \geq s) = P(X \geq t)$, which follows from the survival function:
$$ P(X\geq t) = e^{-\lambda t} $$
$$ E(X) = \dfrac{1}{\lambda} $$
$$ \textrm{Var}(X) = \dfrac{1}{\lambda^{2}} $$
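A simulation sketch of memorylessness and the two moments (note NumPy parameterises the exponential by the mean $1/\lambda$):

```python
import math
import numpy as np

lam = 0.5
rng = np.random.default_rng(6)
x = rng.exponential(scale=1 / lam, size=1_000_000)

s, t = 2.0, 3.0
print((x >= s + t).mean() / (x >= s).mean())   # P(X >= s+t | X >= s)
print((x >= t).mean(), math.exp(-lam * t))     # = P(X >= t) = e^(-lam*t)
print(x.mean(), 1 / lam, x.var(), 1 / lam**2)  # E(X) = 1/lam, Var(X) = 1/lam^2
```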
This distribution is continuous, with support $[a, b]$.
$$ E(X) = \dfrac{(a+b)}{2} $$
$$ \textrm{Var}(X) = \dfrac{(b-a)^{2}}{12} $$
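A quick check of both moments (endpoints chosen arbitrarily):

```python
import numpy as np

a, b = 2.0, 10.0
rng = np.random.default_rng(7)
u = rng.uniform(a, b, size=500_000)

print(u.mean(), (a + b) / 2)       # E(X) = (a+b)/2
print(u.var(), (b - a)**2 / 12)    # Var(X) = (b-a)^2 / 12
```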
$$ f(x) = \dfrac{1}{{\sigma\sqrt{2\pi}}} e^{-(x - \mu)^{2}/(2\sigma^{2}) } $$
Taking a linear transformation of a normally-distributed R.V. generates a normally-distributed random variable.
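A sketch of this closure under linear maps: the transformed draws should match the $N(a\mu + b,\, a^2\sigma^2)$ CDF, computed here via `math.erf` (all parameter values illustrative):

```python
import math
import numpy as np

def norm_cdf(q, mu, sigma):
    return 0.5 * (1 + math.erf((q - mu) / (sigma * math.sqrt(2))))

mu, sigma, a, b = 1.0, 2.0, -3.0, 5.0
rng = np.random.default_rng(8)
y = a * rng.normal(mu, sigma, size=500_000) + b   # should be N(a*mu + b, a^2 sigma^2)

for q in (-4.0, 2.0, 8.0):
    print(q, (y <= q).mean(), norm_cdf(q, a * mu + b, abs(a) * sigma))
```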
Some basic relations:
$$ \textrm{Cov}(X,Y) = \sigma_{XY} = E[(X-\mu_X)(Y-\mu_Y)] $$
$$ \textrm{Cov}(X,X) = \textrm{Var}(X) $$
$$ \textrm{Cov}(X,Y) = E(XY)-E(X)E(Y) $$
$$ \textrm{Cov}(aX+b,\,cY+d) = ac\,\textrm{Cov}(X,Y) $$
$$ \textrm{Var}(X+Y) = \textrm{Var}(X) + \textrm{Var}(Y) + 2\,\textrm{Cov}(X,Y) $$
$$ \textrm{Corr}(X,Y) = \rho_{XY} = \dfrac{\textrm{Cov}(X,Y)}{\sqrt{\textrm{Var}(X)}\,\sqrt{\textrm{Var}(Y)}} $$
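These relations can all be checked numerically; a sketch with an arbitrarily constructed correlated pair:

```python
import numpy as np

rng = np.random.default_rng(9)
X = rng.normal(size=300_000)
Y = 0.6 * X + rng.normal(size=300_000)   # correlated with X by construction

cov = np.cov(X, Y, ddof=0)[0, 1]
print(cov, (X * Y).mean() - X.mean() * Y.mean())          # Cov = E(XY) - E(X)E(Y)
a, b, c, d = 2.0, 1.0, -3.0, 4.0
print(np.cov(a*X + b, c*Y + d, ddof=0)[0, 1], a*c*cov)    # Cov(aX+b, cY+d) = ac Cov(X,Y)
print(np.var(X + Y), np.var(X) + np.var(Y) + 2*cov)       # Var(X+Y)
print(np.corrcoef(X, Y)[0, 1], cov / (X.std() * Y.std())) # Corr(X,Y)
```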
For any non-negative R.V. (Markov's inequality),
$$ P(X\geq t) \leq \frac{E(X)}{t} $$
for $t>0$,
and for any R.V. with finite variance (Chebyshev's inequality),
$$ P(|X-E(X)| \geq t) \leq \textrm{Var}(X)/t^2 $$
for $t>0$.
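A sketch confirming both bounds hold on simulated data (the exponential choice, which is non-negative with $E(X) = 2$, is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(10)
x = rng.exponential(scale=2.0, size=1_000_000)

t = 5.0
print((x >= t).mean(), x.mean() / t)                       # Markov: LHS <= RHS
print((np.abs(x - x.mean()) >= t).mean(), x.var() / t**2)  # Chebyshev: LHS <= RHS
```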
"the definition of the CDF for the standardised version of the sample mean from any distribution, where the sample size $n$ tends to $\infty$, is that of the standard normal"
$$ \lim_{n\to\infty} P\left[ \dfrac{\sqrt{n}(\bar{X}-\mu)}{\sigma} \leq x\right] = \Phi(x) $$
or, in other words:
"CLT implies that were one to draw $n$ samples $X_1$,...,$X_n$ independently and identically, then for reasonably large $n$ each $X_i$ need not be approximately normally distributed, but the sample mean $\bar{X} = \sum_i X_i/n$ will be approximately normally distributed."
$\hat{\theta}$ is an estimator for the parameter, $\theta$.
An estimator is efficient if the spread, Var($\hat{\theta}$), of the estimator is small.
An estimator is robust if it is resilient to errors arising from misspecification of the underlying distribution.
$$ \textrm{MSE}(\hat{\theta}) = \textrm{Var}(\hat{\theta}) + \left[ E(\hat{\theta}) - \theta\right]^{2} $$
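A sketch of the decomposition, comparing the unbiased ($n-1$) and biased ($n$) variance estimators of $\theta = \sigma^2$ (all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(13)
sigma, n, reps = 2.0, 20, 200_000
x = rng.normal(0.0, sigma, size=(reps, n))

for ddof in (1, 0):
    est = x.var(axis=1, ddof=ddof)               # one estimate per simulated sample
    bias_sq = (est.mean() - sigma**2)**2
    # Var + bias^2 matches the directly computed mean squared error
    print(ddof, est.var() + bias_sq, ((est - sigma**2)**2).mean())
```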
For an estimator, e.g. the sample mean $\bar{X}_n$:
$$ \textrm{Var}(\bar{X}_n) = \dfrac{\sigma^{2}}{n} $$
$$ \textrm{SE}(\bar{X}_n) = \dfrac{\sigma}{\sqrt{n}} $$
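A final simulation check of both formulas:

```python
import numpy as np

rng = np.random.default_rng(12)
sigma, n, reps = 3.0, 25, 100_000
means = rng.normal(0.0, sigma, size=(reps, n)).mean(axis=1)

print(means.var(), sigma**2 / n)          # Var of the sample mean = sigma^2 / n
print(means.std(), sigma / np.sqrt(n))    # SE of the sample mean = sigma / sqrt(n)
```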